Implementation of a Parallel NetCDF Interface for Seamless Collective Remote I/O
Abstract
In scientific applications, netCDF [1] was developed to support a view of data as a collection of self-describing, portable, array-oriented objects that can be accessed through a simple interface. Its parallel I/O interface, parallel netCDF (hereafter PnetCDF), was developed on top of an MPI-I/O library [2] such as ROMIO [3], and PnetCDF has been used successfully in scientific computation [4]. However, the same operations have not been available among computers that run different MPI libraries. To provide them, the remote MPI-I/O mechanism of the Stampi library [5] has been implemented in a PnetCDF library as the underlying MPI-I/O layer. Stampi was originally developed to support seamless MPI communication among different MPI libraries by placing a wrapper interface library between a user program and the underlying communication library [6]. It mediates MPI communication among different MPI libraries and hides the complexity and heterogeneity of communication mechanisms across platforms. It also supports MPI-I/O operations not only inside a computer using the underlying MPI library but also among computers that have different MPI libraries [5]. MPI-I/O calls in a user program are switched to the corresponding Stampi MPI-I/O functions in the wrapper library, which automatically selects local or remote I/O according to parameters such as the target computer name specified in an info object. A PnetCDF library has been linked with Stampi's MPI-I/O functions to support seamless remote I/O operations through the PnetCDF API, without requiring the user to deal with the complexity and heterogeneity of the underlying communication and I/O systems. When a PnetCDF function is called in a user program, the I/O call is translated into several Stampi MPI function calls. In local I/O operations, parallel I/O is carried out by the vendor's MPI library according to those function calls. If the vendor's MPI library is not available, UNIX I/O functions are used instead. In remote I/O operations, on the other hand, MPI-I/O processes are invoked on the remote computer with a remote shell command (rsh or ssh) when ncmpi_create() or ncmpi_open() is called, and MPI_File_open() is then called inside those functions. An I/O request from each user process is transferred to the corresponding MPI-I/O process, and each MPI-I/O
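The selection between local and remote I/O described in the abstract can be illustrated with a small sketch. This is not Stampi's code: the info key `target_host`, the function name `choose_io_mode`, and the rule that a target name different from the local host means remote I/O are all assumptions made for illustration. The abstract states only that the decision depends on a target computer name (among other things) given in an info object, and that UNIX I/O is the fallback when the vendor's MPI library is not available.

```python
from typing import Mapping

# Hypothetical key name: the abstract says only that the target computer
# name is specified in an info object, not what the key is called.
TARGET_HOST_KEY = "target_host"


def choose_io_mode(info: Mapping[str, str],
                   local_host: str,
                   vendor_mpi_io_available: bool) -> str:
    """Pick an I/O path in the spirit of the abstract's description.

    Returns one of:
      "remote"        - forward requests to MPI-I/O processes on another host
      "local-mpi-io"  - use the vendor's MPI library on this host
      "local-unix-io" - fall back to UNIX I/O functions on this host
    """
    target = info.get(TARGET_HOST_KEY)
    # Assumption: a target name that differs from the local host
    # selects remote I/O; a missing or matching name selects local I/O.
    if target and target != local_host:
        return "remote"
    if vendor_mpi_io_available:
        return "local-mpi-io"
    return "local-unix-io"


if __name__ == "__main__":
    print(choose_io_mode({}, "hostA", True))
    print(choose_io_mode({"target_host": "hostB"}, "hostA", True))
    print(choose_io_mode({"target_host": "hostA"}, "hostA", False))
```

In this sketch the caller never names the I/O path directly; it only fills in the info mapping, which mirrors the abstract's point that the user program stays unchanged while the wrapper layer decides where the I/O runs.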
Related resources
Implementing a Parallel NetCDF Interface for Seamless Remote I/O Using Multi-dimensional Data
Parallel netCDF supports parallel I/O operations for a view of data as a collection of self-describing, portable, and array-oriented objects that can be accessed through a simple interface. Its parallel I/O operations are realized with the help of an MPI-I/O library. However, such operations are not available in remote I/O. So, a remote I/O mechanism of a Stampi library was intro...
Parallel netCDF: A Scientific High-Performance I/O Interface
Dataset storage, exchange, and access play a critical role in scientific applications. For such purposes netCDF serves as a portable and efficient file format and programming interface, which is popular in numerous scientific application domains. However, the original interface does not provide an efficient mechanism for parallel data storage and access. In this work, we present a new parallel i...
Feasibility Study of Effective Remote I/O Using a Parallel NetCDF Interface in a Long-Latency Network
NetCDF provides a portable and self-describing I/O data format for array-oriented data in scientific computation domains. Its parallel I/O interface named parallel netCDF (hereafter PnetCDF) provides parallel I/O operations with the help of an MPI interface. To realize such operations among computers which have different MPI libraries through a PnetCDF interface, a Stampi library was introduced as...
Efficient Parallel I/O in Community Atmosphere Model (CAM)
Century-long global climate simulations at high resolutions generate large amounts of data in a parallel architecture. Currently, Community Atmosphere Model (CAM), the atmospheric component of the NCAR Community Climate System Model (CCSM), uses sequential I/O which causes a serious bottleneck for these simulations. We describe the parallel I/O development of CAM in this paper. The p...
Evaluating structured I/O methods for parallel file systems
Modern data-intensive structured datasets constantly undergo manipulation and migration through parallel scientific applications. Directly supporting these time-consuming operations is an important step in providing high-performance I/O solutions for modern large-scale applications. High-level interfaces such as HDF5 and Parallel netCDF provide convenient APIs for accessing structured datasets,...
Publication date: 2006